Overview

Dataset statistics

Number of variables 13
Number of observations 141432
Missing cells 1177844
Missing cells (%) 64.1%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 14.0 MiB
Average record size in memory 104.0 B

Variable types

DateTime 1
Categorical 2
Numeric 8
Unsupported 2

Dataset

Description Returns the geocoordinates of where the phone is located. To compare each sensor observation, the frequency was reduced to one minute. The first non-missing name is reported for each of the categorical variables.
Creator Matteo Busso, Massimo Stefan
Author Fausto Giunchiglia, Ivano Bison, Matteo Busso, Ronald Chenu-Abente, Marcelo Rodas Britez, Can Gunel, Giuseppe Veltri, Amalia de Götzen, Peter Kun, Amarsanaa Ganbold, Altangerel Chagnaa, George Gaskell, Miriam Bidoglia, Luca Cernuzzi, Alethia Hume, Jose Luis Zarza, Daniele Miorandi, Carlo Caprini
URL
Copyright (c) KnowDive 2022

Variable descriptions

university University where the experiment took place
experimentid Experiment Id
userid User id
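day Day of the observation, encoded as month (2 digits) followed by day (2 digits)
timestamp Time of the observation, encoded as month (2 digits), day (2 digits), hour (2 digits), minute (2 digits), second (2 digits), and decimals (3 digits)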
accuracy The GPS accuracy in meters
bearing The compass direction from the current position to the intended destination. Bearing is measured in degrees and calculated clockwise from true north (e.g., the bearing for the direction of east is 090°).
lucene NO DESCRIPTION
latitude Geographic coordinate that specifies the N/S position. Latitude is an angle which ranges from 0° at the Equator to 90° at the poles. It is expressed in decimal degrees.
longitude Geographic coordinate that specifies the E/W position. Longitude is an angle which ranges from 0° at the prime meridian to 180°. It is expressed in decimal degrees.
altitude Elevation above sea level in meters.
provider Indicates whether the coordinates were found using the network/Wi-Fi or using GPS
speed The speed of the device, measured in meters/second over ground
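
The day and timestamp encodings described above can be decoded along the following lines. This is a minimal sketch, not the dataset's documented parsing routine: it assumes that day is stored as the integer month × 100 + day (consistent with the reported range 315 to 621) and that timestamp is a digit string ordered month, day, hour, minute, second, milliseconds. The function names and the sample string are hypothetical. Because the encoding carries no year, Python's parser falls back to 1900, which matches the 1900 dates reported for timestamp in the Variables section.

```python
from datetime import datetime
from typing import Tuple


def decode_day(day: int) -> Tuple[int, int]:
    """Split an MMDD-style integer such as 315 into (month, day)."""
    month, day_of_month = divmod(int(day), 100)
    return month, day_of_month


def parse_timestamp(raw: str) -> datetime:
    """Parse an assumed MMDDHHMMSSfff digit string; the year defaults to 1900."""
    return datetime.strptime(raw, "%m%d%H%M%S%f")


print(decode_day(315))                   # (3, 15)
print(decode_day(621))                   # (6, 21)
print(parse_timestamp("0315173400000"))  # 1900-03-15 17:34:00
```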

Alerts

bearing is highly correlated with altitude and 1 other fields High correlation
altitude is highly correlated with bearing and 1 other fields High correlation
speed is highly correlated with bearing and 1 other fields High correlation
userid is highly correlated with day High correlation
day is highly correlated with userid High correlation
altitude is highly correlated with speed High correlation
speed is highly correlated with altitude High correlation
experimentid is highly correlated with university High correlation
university is highly correlated with experimentid High correlation
university is highly correlated with experimentid and 4 other fields High correlation
experimentid is highly correlated with university and 4 other fields High correlation
userid is highly correlated with university and 4 other fields High correlation
day is highly correlated with university and 4 other fields High correlation
latitude is highly correlated with university and 6 other fields High correlation
longitude is highly correlated with university and 5 other fields High correlation
altitude is highly correlated with latitude and 1 other fields High correlation
speed is highly correlated with latitude and 2 other fields High correlation
university has 89498 (63.3%) missing values Missing
experimentid has 89498 (63.3%) missing values Missing
userid has 89498 (63.3%) missing values Missing
day has 89498 (63.3%) missing values Missing
accuracy has 89498 (63.3%) missing values Missing
bearing has 89498 (63.3%) missing values Missing
lucene has 141432 (100.0%) missing values Missing
latitude has 89498 (63.3%) missing values Missing
longitude has 89498 (63.3%) missing values Missing
altitude has 89498 (63.3%) missing values Missing
provider has 141432 (100.0%) missing values Missing
speed has 89498 (63.3%) missing values Missing
speed is highly skewed (γ1 = 35.26145027) Skewed
timestamp has unique values Unique
lucene is an unsupported type, check if it needs cleaning or further analysis Unsupported
provider is an unsupported type, check if it needs cleaning or further analysis Unsupported
bearing has 6267 (4.4%) zeros Zeros
speed has 5006 (3.5%) zeros Zeros
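
The per-column missing counts in the alerts above can be checked against the dataset-level figures in the Overview. The sketch below only redoes that arithmetic with numbers taken from this report; the variable names are illustrative.

```python
n_rows = 141432
n_cols = 13

# Per-column missing counts as listed in the alerts.
partially_missing = 10 * 89498  # university, experimentid, userid, day, accuracy,
                                # bearing, latitude, longitude, altitude, speed
fully_missing = 2 * n_rows      # lucene and provider are missing in every row

missing_cells = partially_missing + fully_missing
missing_pct = 100 * missing_cells / (n_rows * n_cols)

print(missing_cells)          # 1177844
print(round(missing_pct, 1))  # 64.1
```

Both results agree with the "Missing cells" and "Missing cells (%)" entries in the dataset statistics.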

Reproduction

Analysis started 2022-07-04 17:11:36.322731
Analysis finished 2022-07-04 17:11:57.247427
Duration 20.92 seconds
Software version pandas-profiling v3.2.0
Download configuration config.json

Variables

timestamp
Date

UNIQUE

Time of the observation, encoded as month (2 digits), day (2 digits), hour (2 digits), minute (2 digits), second (2 digits), and decimals (3 digits)

Distinct 141432
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 1.1 MiB
Minimum 1900-03-15 17:34:00
Maximum 1900-06-21 22:45:00
Histogram with fixed size bins (bins=50)

university
Categorical

HIGH CORRELATION
MISSING

University where the experiment took place

Distinct 5
Distinct (%) < 0.1%
Missing 89498
Missing (%) 63.3%
Memory size 1.1 MiB
unitn 22678
num 14151
lse 8785
uc 3862
aau 2458

Length

Max length 5
Median length 3
Mean length 3.798975623
Min length 2

Characters and Unicode

Total characters 197296
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row aau
2nd row aau
3rd row aau
4th row aau
5th row aau

Common Values

Value Count Frequency (%)
unitn 22678 16.0%
num 14151 10.0%
lse 8785 6.2%
uc 3862 2.7%
aau 2458 1.7%
(Missing) 89498 63.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
unitn 22678 43.7%
num 14151 27.2%
lse 8785 16.9%
uc 3862 7.4%
aau 2458 4.7%
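
The percentages under "Common Values" and "Category Frequency Plot" differ because they appear to use different denominators: the first is relative to all observations, the second to the non-missing observations only. The sketch below reproduces both columns from the counts in this report; the variable names are illustrative.

```python
n_rows = 141432    # all observations
n_missing = 89498  # rows where university is missing
counts = {"unitn": 22678, "num": 14151, "lse": 8785, "uc": 3862, "aau": 2458}

n_present = n_rows - n_missing  # equals sum(counts.values())

for name, count in counts.items():
    share_of_all = 100 * count / n_rows         # as in "Common Values"
    share_of_present = 100 * count / n_present  # as in "Category Frequency Plot"
    print(f"{name:6s} {share_of_all:5.1f}% {share_of_present:5.1f}%")
```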

Most occurring characters

Value Count Frequency (%)
n 59507 30.2%
u 43149 21.9%
i 22678 11.5%
t 22678 11.5%
m 14151 7.2%
l 8785 4.5%
s 8785 4.5%
e 8785 4.5%
a 4916 2.5%
c 3862 2.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 197296 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
n 59507 30.2%
u 43149 21.9%
i 22678 11.5%
t 22678 11.5%
m 14151 7.2%
l 8785 4.5%
s 8785 4.5%
e 8785 4.5%
a 4916 2.5%
c 3862 2.0%

Most occurring scripts

Value Count Frequency (%)
Latin 197296 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
n 59507 30.2%
u 43149 21.9%
i 22678 11.5%
t 22678 11.5%
m 14151 7.2%
l 8785 4.5%
s 8785 4.5%
e 8785 4.5%
a 4916 2.5%
c 3862 2.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 197296 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
n 59507 30.2%
u 43149 21.9%
i 22678 11.5%
t 22678 11.5%
m 14151 7.2%
l 8785 4.5%
s 8785 4.5%
e 8785 4.5%
a 4916 2.5%
c 3862 2.0%

experimentid
Categorical

HIGH CORRELATION
MISSING

Experiment Id

Distinct 2
Distinct (%) < 0.1%
Missing 89498
Missing (%) 63.3%
Memory size 1.1 MiB
wenet 29256
wenetUnitn 22678

Length

Max length 10
Median length 5
Mean length 7.183348096
Min length 5

Characters and Unicode

Total characters 373060
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row wenet
2nd row wenet
3rd row wenet
4th row wenet
5th row wenet

Common Values

Value Count Frequency (%)
wenet 29256 20.7%
wenetUnitn 22678 16.0%
(Missing) 89498 63.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
wenet 29256 56.3%
wenetUnitn 22678 43.7%

Most occurring characters

Value Count Frequency (%)
e 103868 27.8%
n 97290 26.1%
t 74612 20.0%
w 51934 13.9%
U 22678 6.1%
i 22678 6.1%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 350382 93.9%
Uppercase Letter 22678 6.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 103868 29.6%
n 97290 27.8%
t 74612 21.3%
w 51934 14.8%
i 22678 6.5%
Uppercase Letter
Value Count Frequency (%)
U 22678 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 373060 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 103868 27.8%
n 97290 26.1%
t 74612 20.0%
w 51934 13.9%
U 22678 6.1%
i 22678 6.1%

Most occurring blocks

Value Count Frequency (%)
ASCII 373060 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 103868 27.8%
n 97290 26.1%
t 74612 20.0%
w 51934 13.9%
U 22678 6.1%
i 22678 6.1%

userid
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

User id

Distinct 54
Distinct (%) 0.1%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 41.38387184
Minimum 1
Maximum 239
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 1
5-th percentile 4
Q1 17
median 36
Q3 63
95-th percentile 74
Maximum 239
Range 238
Interquartile range (IQR) 46

Descriptive statistics

Standard deviation 32.37887885
Coefficient of variation (CV) 0.7824033232
Kurtosis 8.384741174
Mean 41.38387184
Median Absolute Deviation (MAD) 23
Skewness 2.004925283
Sum 2149230
Variance 1048.391796
Monotonicity Not monotonic
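
Several of the statistics above are derived from others, which gives a quick consistency check. The sketch below recomputes the range, interquartile range, coefficient of variation, and variance from the userid figures in this report; it illustrates the relationships and is not the profiler's own code.

```python
minimum, maximum = 1, 239
q1, q3 = 17, 63
mean = 41.38387184
std = 32.37887885

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr)          # 238 46
print(round(cv, 4))              # 0.7824
print(round(variance, 2))        # 1048.39
```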
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
74 4340 3.1%
20 3578 2.5%
28 3300 2.3%
11 2702 1.9%
70 2610 1.8%
62 2443 1.7%
63 2274 1.6%
22 2246 1.6%
17 2100 1.5%
4 1897 1.3%
Other values (44) 24444 17.3%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
1 689 0.5%
2 16 < 0.1%
3 161 0.1%
4 1897 1.3%
5 959 0.7%
6 560 0.4%
7 415 0.3%
8 826 0.6%
9 292 0.2%
10 8 < 0.1%

Maximum 10 values

Value Count Frequency (%)
239 345 0.2%
132 1429 1.0%
124 28 < 0.1%
75 588 0.4%
74 4340 3.1%
73 13 < 0.1%
72 6 < 0.1%
70 2610 1.8%
67 1589 1.1%
65 295 0.2%

day
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Day of the observation, encoded as month (2 digits) followed by day (2 digits)

Distinct 42
Distinct (%) 0.1%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 459.6861979
Minimum 315
Maximum 621
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 315
5-th percentile 318
Q1 325
median 403
Q3 610
95-th percentile 618
Maximum 621
Range 306
Interquartile range (IQR) 285

Descriptive statistics

Standard deviation 135.991151
Coefficient of variation (CV) 0.2958347491
Kurtosis -1.879616316
Mean 459.6861979
Median Absolute Deviation (MAD) 85
Skewness 0.1638212076
Sum 23873343
Variance 18493.59315
Monotonicity Increasing
Histogram with fixed size bins (bins=42)

Common values

Value Count Frequency (%)
326 1440 1.0%
329 1440 1.0%
318 1440 1.0%
328 1440 1.0%
320 1440 1.0%
322 1440 1.0%
327 1440 1.0%
324 1440 1.0%
325 1440 1.0%
605 1439 1.0%
Other values (32) 37535 26.5%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
315 154 0.1%
316 888 0.6%
317 1357 1.0%
318 1440 1.0%
319 1437 1.0%
320 1440 1.0%
321 1439 1.0%
322 1440 1.0%
323 1438 1.0%
324 1440 1.0%

Maximum 10 values

Value Count Frequency (%)
621 105 0.1%
620 693 0.5%
619 807 0.6%
618 1102 0.8%
617 1382 1.0%
616 1411 1.0%
615 1342 0.9%
614 1424 1.0%
613 1406 1.0%
612 1410 1.0%

accuracy
Real number (ℝ≥0)

MISSING

The GPS accuracy in meters

Distinct 14674
Distinct (%) 28.3%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 107.7841582
Minimum 0
Maximum 4979.635742
Zeros 9
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 1.1 MiB

Quantile statistics

Minimum 0
5-th percentile 9.5
Q1 15.5
median 20
Q3 28
95-th percentile 600
Maximum 4979.635742
Range 4979.635742
Interquartile range (IQR) 12.5

Descriptive statistics

Standard deviation 354.5019987
Coefficient of variation (CV) 3.288999095
Kurtosis 23.76445124
Mean 107.7841582
Median Absolute Deviation (MAD) 4.980999947
Skewness 4.763849282
Sum 5597662.47
Variance 125671.6671
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
20 6119 4.3%
100 527 0.4%
4 483 0.3%
6 479 0.3%
1799.999023 471 0.3%
500 432 0.3%
16 365 0.3%
26.39999962 345 0.2%
1700 343 0.2%
8 338 0.2%
Other values (14664) 42032 29.7%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
0 9 < 0.1%
0.75 5 < 0.1%
1 14 < 0.1%
1.5 21 < 0.1%
2 61 < 0.1%
2.5 50 < 0.1%
3 174 0.1%
3.182858467 1 < 0.1%
3.21600008 68 < 0.1%
3.34194231 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
4979.635742 1 < 0.1%
4300 1 < 0.1%
4197 1 < 0.1%
3799.999023 1 < 0.1%
3700 1 < 0.1%
3599.999023 1 < 0.1%
3500 2 < 0.1%
3448 1 < 0.1%
3200 1 < 0.1%
3099.999023 12 < 0.1%

bearing
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

The compass direction from the current position to the intended destination. Bearing is measured in degrees and calculated clockwise from true north (e.g., the bearing for the direction of east is 090°).

Distinct 4541
Distinct (%) 8.7%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 20.12325471
Minimum -1
Maximum 359.9
Zeros 6267
Zeros (%) 4.4%
Negative 39609
Negative (%) 28.0%
Memory size 1.1 MiB

Quantile statistics

Minimum -1
5-th percentile -1
Q1 -1
median -1
Q3 -1
95-th percentile 200.811
Maximum 359.9
Range 360.9
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 68.2312274
Coefficient of variation (CV) 3.390665595
Kurtosis 11.03157383
Mean 20.12325471
Median Absolute Deviation (MAD) 0
Skewness 3.441740688
Sum 1045081.11
Variance 4655.500392
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
-1 39609 28.0%
0 6267 4.4%
205.51 42 < 0.1%
40 10 < 0.1%
266.02 9 < 0.1%
175 7 < 0.1%
340.8 6 < 0.1%
299 6 < 0.1%
95.28 6 < 0.1%
348.6 5 < 0.1%
Other values (4531) 5967 4.2%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
-1 39609 28.0%
0 6267 4.4%
0.02 1 < 0.1%
0.05 1 < 0.1%
0.2 3 < 0.1%
0.3 2 < 0.1%
0.37 1 < 0.1%
0.4 2 < 0.1%
0.45 1 < 0.1%
0.5 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
359.9 4 < 0.1%
359.8 3 < 0.1%
359.71 1 < 0.1%
359.7 1 < 0.1%
359.68 1 < 0.1%
359.6 3 < 0.1%
359.5 3 < 0.1%
359.49 1 < 0.1%
359.47 1 < 0.1%
359.4 3 < 0.1%
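
The value -1 accounts for 39609 of the 51934 non-missing bearing readings and lies outside the 0° to 360° range given in the description, so it may be a placeholder for "bearing unavailable" rather than a measurement. The report does not say so; this reading is an assumption. Under that assumption, the sketch below shows how masking the placeholder changes a summary statistic. The sample readings and the helper name mask_sentinel are hypothetical.

```python
from statistics import mean, median

SENTINEL = -1.0  # assumed placeholder for "no bearing available"


def mask_sentinel(values, sentinel=SENTINEL):
    """Return only the readings that differ from the assumed placeholder."""
    return [v for v in values if v != sentinel]


# Hypothetical readings: mostly placeholders, a few bearings in degrees.
readings = [-1.0, -1.0, -1.0, 0.0, 40.0, 175.0, 205.51, -1.0]

valid = mask_sentinel(readings)
print(len(valid))        # number of readings kept
print(median(readings))  # median dominated by the placeholder
print(median(valid))     # median of the remaining bearings
print(mean(valid))
```

If the assumption holds, the median of -1 and the interquartile range of 0 reported for bearing describe the placeholder rather than the measured directions.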

lucene
Unsupported

MISSING
REJECTED
UNSUPPORTED

NO DESCRIPTION

Missing 141432
Missing (%) 100.0%
Memory size 1.1 MiB

latitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Geographic coordinate that specifies the N/S position. Latitude is an angle which ranges from 0° at the Equator to 90° at the poles. It is expressed in decimal degrees.

Distinct 3402
Distinct (%) 6.6%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 41.72893268
Minimum -25.5307
Maximum 55.7154
Zeros 0
Zeros (%) 0.0%
Negative 3862
Negative (%) 2.7%
Memory size 1.1 MiB

Quantile statistics

Minimum -25.5307
5-th percentile -25.2936
Q1 45.3562
median 47.9111
Q3 49.811
95-th percentile 51.529
Maximum 55.7154
Range 81.2461
Interquartile range (IQR) 4.4548

Descriptive statistics

Standard deviation 19.74745999
Coefficient of variation (CV) 0.4732318496
Kurtosis 7.068208706
Mean 41.72893268
Median Absolute Deviation (MAD) 2.5549
Skewness -2.940769513
Sum 2167150.39
Variance 389.9621761
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
45.3562 2957 2.1%
51.4585 2295 1.6%
49.8111 1824 1.3%
45.6343 1785 1.3%
51.5026 1682 1.2%
-25.2936 1636 1.2%
45.8894 1220 0.9%
-25.5218 1181 0.8%
47.9184 1023 0.7%
45.0782 976 0.7%
Other values (3392) 35355 25.0%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
-25.5307 3 < 0.1%
-25.5306 3 < 0.1%
-25.526 1 < 0.1%
-25.5257 2 < 0.1%
-25.5255 1 < 0.1%
-25.5251 1 < 0.1%
-25.5248 2 < 0.1%
-25.5239 5 < 0.1%
-25.5229 1 < 0.1%
-25.5222 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
55.7154 1 < 0.1%
55.7148 1 < 0.1%
55.7131 2 < 0.1%
55.713 3 < 0.1%
55.7129 3 < 0.1%
55.7128 12 < 0.1%
55.7127 1 < 0.1%
55.7126 2 < 0.1%
55.7125 1 < 0.1%
55.7123 1 < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Geographic coordinate that specifies the E/W position. Longitude is an angle which ranges from 0° at the prime meridian to 180°. It is expressed in decimal degrees.

Distinct 3951
Distinct (%) 7.6%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 30.28057019
Minimum -57.6377
Maximum 114.5797
Zeros 0
Zeros (%) 0.0%
Negative 13452
Negative (%) 9.5%
Memory size 1.1 MiB

Quantile statistics

Minimum -57.6377
5-th percentile -55.02769
Q1 -0.0373
median 11.6359
Q3 106.8142
95-th percentile 106.9338
Maximum 114.5797
Range 172.2174
Interquartile range (IQR) 106.8515

Descriptive statistics

Standard deviation 49.86865754
Coefficient of variation (CV) 1.646886344
Kurtosis -0.7764268798
Mean 30.28057019
Median Absolute Deviation (MAD) 11.7058
Skewness 0.5218866345
Sum 1572591.132
Variance 2486.883005
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
11.6359 2940 2.1%
-0.9112 2152 1.5%
106.8142 2058 1.5%
9.9568 1708 1.2%
-0.0738 1575 1.1%
-57.6364 1565 1.1%
11.4234 1504 1.1%
11.042 1180 0.8%
11.7868 887 0.6%
-54.6357 882 0.6%
Other values (3941) 35483 25.1%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
-57.6377 2 < 0.1%
-57.6376 4 < 0.1%
-57.6375 20 < 0.1%
-57.6374 165 0.1%
-57.6373 2 < 0.1%
-57.6371 2 < 0.1%
-57.637 3 < 0.1%
-57.6369 3 < 0.1%
-57.6368 13 < 0.1%
-57.6367 14 < 0.1%

Maximum 10 values

Value Count Frequency (%)
114.5797 1 < 0.1%
114.5775 1 < 0.1%
114.57 1 < 0.1%
114.5695 1 < 0.1%
114.5488 1 < 0.1%
114.5474 1 < 0.1%
114.5406 1 < 0.1%
114.5395 1 < 0.1%
114.5348 1 < 0.1%
114.5345 2 < 0.1%

altitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Elevation above sea level in meters.

Distinct 7783
Distinct (%) 15.0%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 140.7759564
Minimum -612.6527
Maximum 11735.5215
Zeros 13
Zeros (%) < 0.1%
Negative 39677
Negative (%) 28.1%
Memory size 1.1 MiB

Quantile statistics

Minimum -612.6527
5-th percentile -1
Q1 -1
median -1
Q3 -1
95-th percentile 1278
Maximum 11735.5215
Range 12348.1742
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 433.5180995
Coefficient of variation (CV) 3.079489641
Kurtosis 165.5210308
Mean 140.7759564
Median Absolute Deviation (MAD) 0
Skewness 8.308967639
Sum 7311058.52
Variance 187937.9426
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
-1 39609 28.0%
57.9373 42 < 0.1%
1253 41 < 0.1%
224 39 < 0.1%
1252 36 < 0.1%
1259 34 < 0.1%
1255 34 < 0.1%
1267 34 < 0.1%
1266 33 < 0.1%
1263 32 < 0.1%
Other values (7773) 12000 8.5%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
-612.6527 1 < 0.1%
-537.4575 1 < 0.1%
-493.5824 1 < 0.1%
-308.3714 1 < 0.1%
-256.1955 1 < 0.1%
-243.6156 1 < 0.1%
-229.0302 1 < 0.1%
-180 1 < 0.1%
-158.5188 1 < 0.1%
-113.5008 1 < 0.1%

Maximum 10 values

Value Count Frequency (%)
11735.5215 1 < 0.1%
11689.9072 1 < 0.1%
11689.6738 1 < 0.1%
11680.8252 1 < 0.1%
11361.2246 1 < 0.1%
11354.6992 1 < 0.1%
11349.2764 1 < 0.1%
11340.2432 1 < 0.1%
11331.4834 1 < 0.1%
11321.7539 1 < 0.1%

provider
Unsupported

MISSING
REJECTED
UNSUPPORTED

Indicates whether the coordinates were found using the network/Wi-Fi or using GPS

Missing 141432
Missing (%) 100.0%
Memory size 1.1 MiB

speed
Real number (ℝ)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

The speed of the device, measured in meters/second over ground

Distinct 1225
Distinct (%) 2.4%
Missing 89498
Missing (%) 63.3%
Infinite 0
Infinite (%) 0.0%
Mean 0.1162155811
Minimum -1
Maximum 244.9700012
Zeros 5006
Zeros (%) 3.5%
Negative 39609
Negative (%) 28.0%
Memory size 1.1 MiB

Quantile statistics

Minimum -1
5-th percentile -1
Q1 -1
median -0.009999999776
Q3 -0.009999999776
95-th percentile 1.24000001
Maximum 244.9700012
Range 245.9700012
Interquartile range (IQR) 0.9900000002

Descriptive statistics

Standard deviation 5.172068915
Coefficient of variation (CV) 44.50409203
Kurtosis 1527.904085
Mean 0.1162155811
Median Absolute Deviation (MAD) 0.01999999955
Skewness 35.26145027
Sum 6035.539987
Variance 26.75029687
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value Count Frequency (%)
-0.009999999776 20923 14.8%
-1 18686 13.2%
0 5006 3.5%
0.009999999776 85 0.1%
0.2899999917 82 0.1%
0.2199999988 76 0.1%
0.1899999976 75 0.1%
0.2599999905 73 0.1%
0.3899999857 72 0.1%
0.4799999893 72 0.1%
Other values (1215) 6784 4.8%
(Missing) 89498 63.3%

Minimum 10 values

Value Count Frequency (%)
-1 18686 13.2%
-0.009999999776 20923 14.8%
0 5006 3.5%
0.009999999776 85 0.1%
0.01999999955 65 < 0.1%
0.02999999933 51 < 0.1%
0.03999999911 53 < 0.1%
0.05000000075 20 < 0.1%
0.05999999866 48 < 0.1%
0.0700000003 27 < 0.1%

Maximum 10 values

Value Count Frequency (%)
244.9700012 1 < 0.1%
243.3500061 1 < 0.1%
242.5 1 < 0.1%
242.3600006 1 < 0.1%
240.75 1 < 0.1%
240.6199951 1 < 0.1%
240.1100006 1 < 0.1%
238.8699951 1 < 0.1%
237.1799927 1 < 0.1%
235.4400024 1 < 0.1%

Interactions

[Figure: pairwise interaction plots (64 panels); images not preserved in this text version.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
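
The calculation just described can be sketched in plain Python: rank both variables, then apply Pearson's formula to the ranks. This is an illustrative implementation with hypothetical helper names (average_ranks, pearson, spearman) and made-up data, not the code pandas-profiling runs. Ties receive the mean of the ranks they span, which is one common convention.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic in x but not linear

print(spearman(x, y))  # total positive monotonic correlation
print(pearson(x, y))   # below 1 because the relation is not linear
```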

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
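
As a small illustration of this definition and of the invariance under changes of location and scale, the sketch below computes r directly from the covariance and standard deviations. The function name pearson_r and the sample data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.5, 9.0]
rescaled = [10 * v + 3 for v in y]  # change of location and scale

print(pearson_r(x, y))
print(pearson_r(x, rescaled))  # same value up to floating-point rounding
```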

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
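
The pair-counting rule above can be written out directly. The sketch below implements exactly the formula as stated, a variant often called tau-a, in which tied pairs count as neither concordant nor discordant. Library implementations commonly apply a tie correction, so their results can differ when ties are present. The function name kendall_tau_a and the data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau_a(x, y))  # (5 - 1) / 6
```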

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
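
For orientation, the sketch below computes a bias-corrected Cramér's V using the correction as it is commonly stated for Bergsma's 2013 proposal: the chi-squared based φ² and the table dimensions are each adjusted by a term that shrinks with n − 1. Whether pandas-profiling v3.2.0 computes it in exactly this way is not confirmed here. The function name and the sample labels are hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((row, col), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical labels: each university maps to exactly one experiment id.
university = ["unitn"] * 50 + ["lse"] * 50
experiment = ["wenetUnitn"] * 50 + ["wenet"] * 50
unrelated = ["p", "q"] * 50  # evenly spread over both universities

print(cramers_v_corrected(university, experiment))  # perfect association
print(cramers_v_corrected(university, unrelated))   # independence
```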

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
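
The nullity correlation mentioned above can be understood as an ordinary correlation between missingness indicators. The sketch below is a simplified illustration with hypothetical data and a hypothetical function name, not the code behind the heatmap. It treats None as missing and returns NaN for a column that is never or always missing, since such a column has no variation to correlate.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # no variation in missingness
    return cov / math.sqrt(var_a * var_b)


# Hypothetical rows: latitude and longitude are missing together.
latitude = [45.3562, None, None, 51.4585, None]
longitude = [11.6359, None, None, -0.9112, None]
timestamp = ["t1", "t2", "t3", "t4", "t5"]  # never missing

print(nullity_correlation(latitude, longitude))  # close to 1
print(nullity_correlation(latitude, timestamp))  # nan
```

In this report, ten variables share the same missing count (89498), which would be consistent with their values being missing on the same rows, although the counts alone do not prove it.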